Solving Classical String Problems on Compressed Texts

Author

  • Yury Lifshits
Abstract

We study the complexity of string problems as a function of the size of a program that generates the input. We consider straight-line programs (SLPs), since all algorithms on SLP-generated strings can be applied to processing LZ-compressed texts. The main result is a new algorithm for pattern matching when both a text T and a pattern P are presented by SLPs (the so-called fully compressed pattern matching problem). We show how to find the first occurrence, count all occurrences, and check whether any given position is an occurrence in time O(nm), where m and n are the sizes of the straight-line programs generating P and T, respectively. We then present polynomial algorithms for computing the fingerprint table and a compressed representation of all covers (for the first time), and for finding the periods of a given compressed string (our algorithm is faster than previously known ones). On the other hand, we show that computing the Hamming distance between two SLP-generated strings is NP- and coNP-hard.
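To make the notion of a straight-line program concrete, the following is a minimal sketch in Python. It is not the pattern matching algorithm from the paper. The list-based encoding of rules and the helper names (`fibonacci_slp`, `slp_lengths`, `char_at`, `expand`) are assumptions made for illustration only. Each rule is either a single terminal character or a pair of indices of earlier rules whose texts are concatenated, so a program with k rules can generate a text whose length is exponential in k.

```python
from typing import List, Tuple, Union

# A rule is a terminal character or a concatenation of two EARLIER rules.
Rule = Union[str, Tuple[int, int]]


def fibonacci_slp(k: int) -> List[Rule]:
    """Hypothetical example SLP: rule i (i >= 2) is rule[i-1] + rule[i-2]."""
    rules: List[Rule] = ["b", "a"]
    for i in range(2, k):
        rules.append((i - 1, i - 2))
    return rules


def slp_lengths(rules: List[Rule]) -> List[int]:
    """Length of the text generated by each rule, in one pass over the rules."""
    lens: List[int] = []
    for rule in rules:
        if isinstance(rule, str):
            lens.append(1)
        else:
            left, right = rule
            lens.append(lens[left] + lens[right])
    return lens


def char_at(rules: List[Rule], lens: List[int], pos: int) -> str:
    """Symbol at position pos of the text of the last rule, without decompression."""
    node = len(rules) - 1
    if not 0 <= pos < lens[node]:
        raise IndexError("position outside the generated text")
    while not isinstance(rules[node], str):
        left, right = rules[node]
        if pos < lens[left]:
            node = left            # position falls in the left part
        else:
            pos -= lens[left]      # skip the left part
            node = right
    return rules[node]


def expand(rules: List[Rule], node: int) -> str:
    """Full decompression; only sensible for small programs."""
    rule = rules[node]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(rules, left) + expand(rules, right)
```

With `rules = fibonacci_slp(80)`, the generated text is far too long to write out, yet `char_at(rules, slp_lengths(rules), pos)` follows at most one rule per level of the program. This is the sense in which running time is measured in the size of the program rather than the length of the text.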

Similar resources

Solving Classical String Problems on Compressed Texts

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing LZ-compressed texts. The main result is a new algorithm for pattern matching when both a text T and a pattern P are presented by SLPs (so-called fully compressed pattern...

Faster Subsequence and Don't-Care Pattern Matching on Compressed Texts

Subsequence pattern matching problems on compressed text were first considered by Cégielski et al. (Window Subsequence Problems for Compressed Texts, Proc. CSR 2006, LNCS 3967, pp. 127–136), where the principal problem is: given a string T represented as a straight-line program (SLP) of size n and a string P of size m, compute the number of minimal subsequence occurrences of P in T. We present ...

Efficient Algorithms for Lempel-Ziv Encoding

We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms for compressed texts without decompressing. D...

Querying and Embedding Compressed Texts

The computational complexity of two simple string problems on compressed input strings is considered: the querying problem (What is the symbol at a given position in a given input string?) and the embedding problem (Can the first input string be embedded into the second input string?). Straight-line programs are used for text compression. It is shown that the querying problem becomes P-complete...

Randomized Efficient Algorithms for Compressed Strings: the Fingerprint Approach

Denote by LZ(w) the coded form of a string w produced by the Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is equality testing: given LZ(w) and integers i, j, k, test the equality w[i..i+k] = w[j..j+k]. We give a simple and efficient randomized algorithm for this problem using the fingerprinting id...

Journal:
  • CoRR

Volume abs/cs/0604058

Publication year 2006